ABSTRACT

JPEG 2000 Part 2 (Extensions) contains a number of technologies that are of potential interest in remote sensing applications. 
These include arbitrary wavelet transforms, techniques to limit boundary artifacts in tiles, multiple component transforms, 
and trellis-coded quantization (TCQ). We are investigating the addition of these features to the low-memory (scan-based) 
implementation of JPEG 2000 Part 1. A scan-based implementation of TCQ has been realized and tested, with a very small 
performance loss as compared with the full image (frame-based) version. A proposed amendment to JPEG 2000 Part 2 will 
effect the syntax changes required to make scan-based TCQ compatible with the standard. 
1. BACKGROUND

Early in the development of the JPEG 2000 standard, a decision was made to divide the technology into Part 1 (Core Coding 
System) and Part 2 (Extensions). Part 1 contains the features that all decoders must support, in order to be called JPEG 2000 
compliant. These include (9x7) and (5x3) wavelet filters with a Mallat decomposition tree, scalar quantization, and three- 
component color space transforms. There was also a requirement that all technologies accepted for Part 1 would be offered 
by their originators on a royalty-free, non-discriminatory basis. Technologies that were considered too complex, too limited 
in their application, or potentially subject to license fees, were placed in Part 2. Unlike Part 1, the Part 2 technologies do not 
have to be supported as a group by all decoders. One or more Part 2 technologies may be added to a Part 1 decoder to make 
it Part 2 compliant. The other parts of the JPEG 2000 standard - Part 3 (Motion JPEG 2000), Part 4 (Conformance Testing), 
Part 5 (Reference Software), and Part 6 (Compound Imagery) - will not be discussed here. 

2. THE SCAN-BASED MODE 

For technology development purposes, the JPEG 2000 algorithm is embodied in the Verification Model (VM) software, 
which is maintained by Science Applications International Corporation (SAIC) and the University of Arizona (UA). Early 
versions of the VM required the entire image to be retained in memory during computation of the compressed file. Later 
versions required only that the entire image be buffered in the compressed domain, in order to achieve effective rate control. 
This configuration is sometimes referred to as the “frame-based mode.” 

Representatives of the remote sensing community pointed out that airborne and satellite-borne instruments have extremely 
limited memory, owing to size, weight and power constraints. Moreover, many remote sensing instruments are pushbroom 
scanners, which naturally build up a large image one line at a time. For these applications, it is desirable to have a JPEG
2000 implementation that buffers up the smallest possible number of image lines. This configuration is called the
“scan-based mode.”

On the basis of two experiments performed by SAIC/UA and the Centre National d’Etudes Spatiales (CNES), SAIC 
integrated an implementation of the scan-based mode into the VM [1, 2]. This implementation, which incorporated only Part 1
features, has been described in detail by Flohr et al. [3]. The rate control buffer, which is the largest buffer in the frame-based
mode, is set up to contain a user-selectable number of scan elements, where a scan element may be either a tile or a precinct. 
A sliding window rate control is then effected by truncating the scan elements in the buffer to achieve the desired bit rate. As 
a new scan element enters the buffer, bytes are released from the scan element at the head of the window. 


3. PART 2 FEATURES IN THE SCAN-BASED MODE 


The technologies included in JPEG 2000 Part 2 are primarily intended for certain niche markets. Remote sensing is such a 
market, and several Part 2 technologies are especially applicable to remote sensing situations. 


3.1 The wavelet transform 


Whereas Part 1 allows only two wavelet filters and one decomposition tree, the user can specify an arbitrary wavelet in
Part 2. Experience has shown that for synthetic aperture radar (SAR) data, improved visual quality can often be obtained by using
a longer filter and a more detailed decomposition tree [4, 5]. One such decomposition, the packet decomposition, is compared
with the standard 5-level Mallat decomposition in Figure 1.
Figure 1. Comparison of Mallat vs. packet wavelet decomposition structures


A second wavelet feature of interest in remote sensing is the use of the single sample overlap discrete wavelet transform 
(SSODWT) [6] or, alternatively, the “odd tile/low pass first” (OTLPF) convention [7] to reduce boundary artifacts at tile edges.
Although precincts generally give better image quality than tiles in the scan-based mode, they do allow limited error 
propagation between scan elements. Because of the continuity of the wavelet transform, a bit error in one precinct will cause 
lower-amplitude errors in neighboring precincts, according to the formula 

$$e_l = 2e_{l-1} + k - 2 \qquad (1)$$

where $e_l$ is the extent of errors in level $l$, $e_{l-1}$ is the extent of errors in the previous level, and $k$ is the length of the longest
synthesis filter. Figure 2 gives an example of this error propagation.
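
The recurrence in Equation (1) can be tabulated with a short script. The following is only an illustrative sketch: the function name `error_extent`, the starting extent `e0 = 1` (a single corrupted coefficient), and the use of 9 and 5 as the longest synthesis filter lengths for the (9x7) and (5x3) filter pairs are assumptions made here, not values taken from the VM.

```python
def error_extent(levels, k, e0=1):
    """Return [e_0, e_1, ..., e_levels] for the recurrence e_l = 2*e_(l-1) + k - 2.

    k  : length of the longest synthesis filter.
    e0 : assumed extent of the initial error (hypothetical starting value).
    """
    extents = [e0]
    for _ in range(levels):
        # Each level doubles the previous extent and widens it by k - 2.
        extents.append(2 * extents[-1] + k - 2)
    return extents


if __name__ == "__main__":
    # Assumed longest synthesis filter lengths: 9 for (9x7), 5 for (5x3).
    for name, k in (("9x7", 9), ("5x3", 5)):
        print(name, error_extent(5, k))
```

Under these assumptions the extent roughly doubles at every level, and the longer filter spreads a single error over a wider neighborhood at each level.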



Figure 2. Error propagation as a function of resolution level for two JPEG 2000 filters 



Thus if error containment is the primary concern, as it may be in some remote sensing situations where the compressed 
imagery is to be transmitted over a noisy channel, tiles may be preferred as scan elements despite the possibility of boundary 
artifacts. Under such circumstances, artifact reduction techniques such as SSODWT and OTLPF may be useful. 

The VM implementation of the wavelet transform, including the Part 2 options, is not incompatible with the scan-based 
mode. However, it currently buffers more than the minimum number of image lines required to complete the sliding window 
transform. (This minimum is on the order of the maximum vertical filter length.) Some optimization in terms of memory 
management may be required to obtain the best results for the scan-based mode. 

3.2 The multiple component transform 

JPEG 2000 Part 1 specifies two transforms, the reversible and irreversible multiple component transforms, that may be 
applied to the first three components of an image (although as many as 16K components may be present). But multispectral 
and hyperspectral imagery play a large part in remote sensing science, and these multi-component images are highly 
correlated in the third (wavelength) dimension. JPEG 2000 Part 2 allows two types of multiple component transform: an 
arbitrary linear transform (including the Karhunen-Loève [KLT] and Differential Pulse Code Modulation [DPCM]
transforms) and a wavelet transform in the third dimension, which is performed independently of the two-dimensional spatial 
wavelet transform. 

The scan-based mode takes advantage of one of the five progression orders allowed in JPEG 2000, namely progression by 
location, to transmit an image a few lines at a time. The second term in this order is progression by component, so that in fact 
all the components of a scan element will be output before the encoder begins on the next scan element. Thus it is possible to 
perform a multiple component transform within a single scan element (even if the scan element is a precinct). However, 
transforms that require the collection of statistics over the whole image - like the KLT - are clearly ruled out. Other linear 
transforms, and the third-dimension wavelet transform, are compatible with the scan-based mode. As in the case of the two- 
dimensional wavelet, some optimization may be required. 
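
To illustrate why a transform without global statistics fits the scan-based mode, the sketch below applies a simple first-order prediction across the components of one scan element and then inverts it. It is a hedged illustration only: the function names, the list-of-lists data layout, and the choice of a plain previous-band predictor are assumptions made here, and the code does not reproduce the Part 2 syntax or the VM implementation.

```python
def dpcm_forward(components):
    """Predict each component from the previous one within a single scan element.

    components: list of equal-length lists of integer samples, one per band.
    Returns the first band unchanged followed by band-to-band residuals.
    """
    residuals = [list(components[0])]
    for prev, cur in zip(components, components[1:]):
        residuals.append([c - p for c, p in zip(cur, prev)])
    return residuals


def dpcm_inverse(residuals):
    """Rebuild the original components from the output of dpcm_forward."""
    components = [list(residuals[0])]
    for res in residuals[1:]:
        prev = components[-1]
        components.append([r + p for r, p in zip(res, prev)])
    return components


if __name__ == "__main__":
    # Three hypothetical, highly correlated bands of one scan element.
    bands = [[10, 12, 11], [11, 13, 12], [13, 14, 12]]
    coded = dpcm_forward(bands)
    print(coded)
    print(dpcm_inverse(coded) == bands)
```

Because the predictor uses only samples that belong to the current scan element, nothing has to be collected over the whole image before the element can be transformed and released.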

3.3 Trellis-coded quantization (TCQ) 

Trellis-coded quantization (TCQ) may be thought of as time-varying scalar quantization, or as an approach to vector
quantization [8]. It has been shown to produce better visual quality than scalar quantization, especially for detected SAR
imagery [10]. So despite its increased complexity, TCQ may be desirable for some remote sensing applications.

In the frame-based mode, the step sizes for TCQ are determined by a Lagrangian rate allocator (LRA), which models the 
statistics for the entire image. The LRA may be used in a single pass, but better results are obtained when the rate allocator is 
allowed to iterate until it achieves the target bit rate (within a user-selectable tolerance). 

In the scan-based mode, iterated rate control is unacceptable because of the need for maximum throughput. In order to 
achieve effective single-pass rate control, it is necessary to compute the quantization step sizes separately for each scan 
element. If precincts are used as scan elements, rather than tiles, this procedure is known as “precinct-dependent
quantization.” It is applicable to various forms of scalar quantization, as well as to TCQ. Some minor syntax changes are
required in JPEG 2000 Part 2, in order to signal the step sizes on a precinct-by-precinct basis. Since Part 2 has already 
reached the Final Draft International Standard (FDIS) level, these changes have been presented as an amendment to the 
standard. In the normal course of events, the amendment will become final in May 2002. 

Unlike the scan-based implementation of Part 1 [3], the rate control buffer for Part 2 is part of the quantization object, if explicit
quantization is being used. This buffer holds only one scan element at a time (Figure 3). 




Figure 3. Scan-based TCQ flow diagram 


In our implementation, the LRA collects statistics for the first scan element in the image. For this first scan element, $S_1$, the
"target rate", $R_1$, of the LRA is set equal to $T_1$, the desired global rate for the image as a whole. The rate actually achieved
after compression of $S_1$ is $A_1$. ($R_1$, $T_1$, and $A_1$ are measured in bits per pixel (bpp).) Let $D_1$ be the size of the initial input file
and $B_1$ be the size of the desired output file after compression. ($D_1$ is measured in pixels and $B_1$ is measured in bits.)


For subsequent scan elements, the target rate is modified, based on performance on the preceding scan elements. Thus, for
the second scan element, $S_2$, we set

$$T_2 = \frac{B_2}{D_2} \qquad (2)$$

where $D_2$ is the input file size after removal of $S_1$, and $B_2$ is the remaining space in the output file after compression of $S_1$.
Then the "target rate" is modified such that

$$R_2 = T_2 \frac{R_1}{A_1} \qquad (3)$$

Statistics are then gathered for $S_2$ and LRA is performed using a target rate of $R_2$. More generally, for the nth scan element,
the same update is applied, where $B_n$ and $D_n$ are the remaining output and input file sizes, respectively, after compression of
scan element $n-1$. LRA is then performed using statistics from the nth scan element for a target rate of $R_n$.
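
The single-pass update can be sketched in a few lines. The code below is an illustration under stated assumptions, not the VM implementation: it takes the desired rate for each scan element to be the remaining output bits divided by the remaining input pixels, and it assumes that the target rate handed to the LRA is that desired rate scaled by the ratio of the previous target to the previously achieved rate. The function name and the `compress` callable (a stand-in for LRA, quantization and coding of one scan element) are hypothetical.

```python
def scan_based_rate_control(element_pixels, total_bits, compress):
    """Single-pass target-rate update over scan elements (illustrative sketch).

    element_pixels : pixel count of each scan element, in scan order.
    total_bits     : desired size of the whole compressed file, in bits.
    compress       : hypothetical callable (index, target_bpp) -> achieved bpp.
    Returns a list of (desired, target, achieved) rates in bits per pixel.
    """
    remaining_pixels = sum(element_pixels)  # D_n: input still to be coded
    remaining_bits = float(total_bits)      # B_n: output space still available
    prev_target = prev_achieved = None
    history = []
    for n, pixels in enumerate(element_pixels):
        desired = remaining_bits / remaining_pixels  # T_n = B_n / D_n
        if prev_target is None:
            target = desired  # first scan element: R_1 = T_1
        else:
            # Assumed general form: R_n = T_n * R_(n-1) / A_(n-1)
            target = desired * prev_target / prev_achieved
        achieved = compress(n, target)  # A_n
        history.append((desired, target, achieved))
        remaining_pixels -= pixels
        remaining_bits -= achieved * pixels
        prev_target, prev_achieved = target, achieved
    return history


if __name__ == "__main__":
    # Toy coder (an assumption) that always overshoots its target by 10%.
    for row in scan_based_rate_control([100] * 4, 400, lambda n, r: 1.1 * r):
        print(row)
```

With a coder whose overshoot is a fixed factor, as in this toy example, the correction brings the achieved rate onto the desired rate from the second scan element onward, and the total output matches the requested size. A real coder's error varies from element to element, so the match would only be approximate.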


We are also experimenting with the introduction of a damping term to limit the fluctuations of $R_n$, with damping
parameters $\alpha$ and $\beta$ such that $\beta > \alpha$.

This scan-based implementation of TCQ was tested on the four remote sensing images from the JPEG 2000 test set: aerial1
(cropped to 5K x 5K), aerial2, sar1, and sar2. PSNR for the scan-based TCQ implementation was compared with PSNR for
frame-based TCQ. As described above, there was no iteration in the rate control for the scan-based mode, while the LRA in 
the frame-based mode was allowed to iterate. Table 1 shows the difference in PSNR between the single-pass scan-based 
mode and the iterated frame-based mode, averaged over four images, as a function of bit rate. The performance difference is 
very small. 

Table 1. Performance difference between full TCQ and scan-based TCQ for four remote sensing images



Results for aerial1 are shown graphically in Figure 4.



Figure 4. PSNR Comparison for aerial1

Although our current implementation demonstrates the capability of scan-based TCQ, it is not entirely appropriate for on- 
board use with a pushbroom scanning sensor. We have made use of the input file size and desired output file size of the 
image as a whole, which would not be available in a pushbroom scanner, where the number of image lines to be compressed 
is usually not known at the outset. We plan to modify our approach in the near future to eliminate the use of global input and 
output file sizes in the rate allocator. 




4. CONCLUSION 

Several features in JPEG 2000 Part 2 are of potential benefit for remote sensing applications. These features can be
implemented in a low-memory (scan-based) mode.

Upon completing the scan-based implementation of the JPEG 2000 Part 2 features described here, we plan to port our
software to a flight simulation environment where it can be demonstrated under realistic conditions for on-orbit use. It is
hoped that this exercise will hasten the day on which JPEG 2000 may come into use in satellite-borne applications.